{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Trainer\n",
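    "In addition to launching fine-tuning through the [resource API](./api_based_finetune.ipynb), the Qianfan Python SDK provides a Trainer API that makes it easier to build an end-to-end fine-tuning pipeline. It also lets you register callbacks for status events, so the training process can be monitored through event dispatch.\n",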
    "\n",
    "\n",
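    "This example uses qianfan==0.3.15 to walk through the complete process: load a local dataset with Dataset, upload it to the Qianfan platform, fine-tune ERNIE-Speed-8K, run batch evaluation with Model, publish the result as a service, and finally call that service."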
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! pip install \"qianfan>=0.3.15\" -U"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'0.3.15'"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import qianfan\n",
    "qianfan.__version__"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prerequisites\n",
    "- Set the Qianfan access key (AK) and secret key (SK) used for authentication"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.environ[\"QIANFAN_ACCESS_KEY\"] = \"your_ak\"\n",
    "os.environ[\"QIANFAN_SECRET_KEY\"] = \"your_sk\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
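    "#### Import dependencies\n",
    "- `qianfan.trainer.consts`: constants used by the trainer\n",
    "- `qianfan.resources.console.consts`: field constants defined at the API layer\n",
    "- `qianfan.trainer.configs`: config data classes required by the trainer\n",
    "- `qianfan.resources.QfMessages`: builds the input messages for qianfan.ChatCompletion\n",
    "- `qianfan.trainer.LLMFinetune`: the Trainer implementation for LLM fine-tuning tasks\n",
    "- `qianfan.dataset.Dataset`: the Qianfan dataset class, which handles import and export of Qianfan-platform, local, and third-party datasets, as well as operations such as data cleaning"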
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "from qianfan.trainer.consts import ActionState\n",
    "from qianfan.model.consts import ServiceType\n",
    "from qianfan.resources.console import consts as console_consts\n",
    "from qianfan.trainer.configs import TrainConfig, DatasetConfig\n",
    "from qianfan.model.configs import DeployConfig\n",
    "from qianfan.resources import QfMessages\n",
    "from qianfan.trainer import Finetune\n",
    "from qianfan.dataset import Dataset\n",
    "from qianfan.utils import enable_log\n",
    "import logging\n",
    "\n",
    "enable_log(logging.INFO)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
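    "## Loading the dataset\n",
    "\n",
    "The Qianfan SDK provides a dataset implementation that quickly loads a local dataset into memory. By specifying a DataSource, the dataset can then be saved locally or to the Qianfan platform."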
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[INFO][2024-06-14 10:11:20.926] dataset.py:408 [t:8344509248]: no data source was provided, construct\n",
      "[INFO][2024-06-14 10:11:20.930] dataset.py:276 [t:8344509248]: construct a file data source from path: ./data/fin_cqa_train.jsonl, with args: {}\n",
      "[INFO][2024-06-14 10:11:20.938] file.py:293 [t:8344509248]: use format type FormatType.Jsonl\n",
      "[INFO][2024-06-14 10:11:20.950] utils.py:349 [t:8344509248]: start to get memory_map from .qf_cache/dataset/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/data/fin_cqa_train.arrow\n",
      "[INFO][2024-06-14 10:11:20.972] utils.py:277 [t:8344509248]: has got a memory-mapped table\n",
      "[INFO][2024-06-14 10:11:20.984] dataset.py:994 [t:8344509248]: list local dataset data by None\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[[{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的产品是？在国际奶粉价格下降压力下,国内奶价仍有下降空间',\n",
       "   'response': [['奶']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的地区是？在国际奶粉价格下降压力下,国内奶价仍有下降空间',\n",
       "   'response': [['国内']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['需求增加导致市场价格提升']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的原因涉及的产品是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['高碳铬铁']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的原因涉及的行业是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['我们无法得知，可能需要更多内容说明。']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的结果涉及的产品是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['高碳铬铁']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['市场价格下降导致市场价格下降']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的原因涉及的产品是？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['7-ACA']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的产品是？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['原料药']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的行业是？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['暂不清楚，需要更多信息说明。']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？寒潮导致农产品价格继续攀升', 'response': [['寒潮导致市场价格提升']]}],\n",
       " [{'prompt': '下文中寒潮导致市场价格提升事件对应的结果涉及的产品是？寒潮导致农产品价格继续攀升',\n",
       "   'response': [['农产品']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['市场价格下降导致销量（消费）减少']]}],\n",
       " [{'prompt': '下文中市场价格下降导致销量（消费）减少事件对应的原因涉及的产品是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['汽油,柴油']]}],\n",
       " [{'prompt': '下文中市场价格下降导致销量（消费）减少事件对应的结果涉及的产品是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['成品油']]}],\n",
       " [{'prompt': '下文中市场价格下降导致销量（消费）减少事件对应的结果涉及的地区是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['从以上文本中我们暂无发现。']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['市场价格提升导致运营成本提升']]}],\n",
       " [{'prompt': '下文中市场价格提升导致运营成本提升事件对应的原因涉及的产品是？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['煤炭']]}],\n",
       " [{'prompt': '下文中市场价格提升导致运营成本提升事件对应的原因涉及的行业是？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['以上内容暂无说明。']]}],\n",
       " [{'prompt': '下文中市场价格提升导致运营成本提升事件对应的结果涉及的产品是？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['煤炭']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['其他自然灾害导致需求减少']]}],\n",
       " [{'prompt': '下文中其他自然灾害导致需求减少事件对应的原因涉及的产品是？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['以上内容暂无说明。']]}],\n",
       " [{'prompt': '下文中其他自然灾害导致需求减少事件对应的原因涉及的地区是？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['国内外']]}],\n",
       " [{'prompt': '下文中其他自然灾害导致需求减少事件对应的结果涉及的产品是？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['肥']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['供给减少导致市场价格提升']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的原因涉及的产品是？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['棕榈油']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的原因涉及的地区是？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['东南亚']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的结果涉及的产品是？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['原油']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？南美大豆供求宽松对豆类油脂价格利空', 'response': [['供给增加导致负向影响']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的原因涉及的产品是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['大豆']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的原因涉及的地区是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['南美']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的结果涉及的产品是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['豆类油脂']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的原因涉及的产品是？尿素：随着天气转暖，春耕、北方小麦返青肥、南方水稻用肥需求增加，价格稳中上涨',\n",
       "   'response': [['小麦返青肥,水稻用肥']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的原因涉及的地区是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['南美']]}],\n",
       " [{'prompt': '下文中需求减少导致市场价格下降事件对应的原因涉及的行业是？随着国内外进入消费淡季，近期国内钢价持续下跌，预计3季度国内钢价将维持弱势，但受成本支撑，国内钢价难以大幅下跌，基本维持高位运行。2009年后，国内钢铁行业盈利将有小幅下降，因为需求减弱导致钢价下跌而成本保持稳定，基于钢价的下跌和新区盈利的改善，马钢下半年盈利吨钢将有小幅下降，我们预计公司2008年将实现净利润41.2亿元，假设年末马钢权证全部行权，预计2008马钢可实现每股收益0.53元，目前股价下对应8倍2008P/E和1.1倍2008P/B，维持增持评级',\n",
       "   'response': [['钢铁行业']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的原因涉及的产品是？在国际奶粉价格下降压力下,国内奶价仍有下降空间',\n",
       "   'response': [['奶粉']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['市场价格下降导致市场价格下降']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？南美大豆供求宽松对豆类油脂价格利空', 'response': [['供给增加导致负向影响']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？寿光受洪水影响，菜价禽价上涨，继续关注通胀机会，重点推荐禽链',\n",
       "   'response': [['洪涝导致市场价格提升']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？10月份，美元贬值成为基本金属价格上涨的主要的动力',\n",
       "   'response': [['市场价格下降导致市场价格提升']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的原因涉及的行业是？浮法玻璃价格上涨、太阳能板块景气度回升是业绩预增的主因2013年年初至今,浮法玻璃行业受益于需求的逐步回升、停产冷修带来的供给端收缩,浮法玻璃格持续上涨,预计年初至今公司的浮法玻璃价格涨幅近10%,因此浮法玻璃业务的盈利大幅提升',\n",
       "   'response': [['浮法玻璃行业']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？风险因素：光棒供给恢复导致光纤价格回落、成本提高压制毛利率',\n",
       "   'response': [['供给增加导致市场价格下降']]}],\n",
       " [{'prompt': '下文中限产导致供给减少事件对应的原因涉及的产品是？而炼焦煤方面，近期虽然处于焦煤消费淡季，但由于钢铁产量仍处于高位，各地停产限产政策都炼焦煤给影响较大，预计炼焦煤供需短期仍处于偏紧状态，焦煤价格仍较坚挺###而炼焦煤方面,近期钢铁产业链需求有所收缩,但由于各地停产限产政策对炼焦煤给影响较大,预计炼焦煤供需短期仍处于偏紧状态,焦煤价格仍较坚挺###而炼焦煤方面，近期钢铁产业链需求有所收缩，但由于各地停产限产政策都炼焦煤给影响较大，预计炼焦煤供需短期仍处于偏紧状态，焦煤价格仍较坚挺###焦煤方面，目前限产政策仍在执行，虽有部分地区煤矿超产，但由于目前焦化厂仍有50-100元/吨的利润，其开工率仍保持在高位，对炼焦煤求尚未明显减少，因此整体炼焦煤资源略显紧张',\n",
       "   'response': [['暂不清楚，需要更多信息说明。']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的结果涉及的产品是？除了越南暂停大米出口外，哈萨克斯坦、塞尔维亚、埃及等多地均发布农产品临时出口禁令，供弱需强带动国际大米及小麦近期价格上涨，越南大米到岸价已创2018年8月份以来新高，国际小麦价格较去年同期上涨16%',\n",
       "   'response': [['大米,小麦']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？除了越南暂停大米出口外，哈萨克斯坦、塞尔维亚、埃及等多地均发布农产品临时出口禁令，供弱需强带动国际大米及小麦近期价格上涨，越南大米到岸价已创2018年8月份以来新高，国际小麦价格较去年同期上涨16%',\n",
       "   'response': [['供给减少导致市场价格提升;需求增加导致市场价格提升']]}],\n",
       " [{'prompt': '下文中需求增加导致销量（消费）增加事件对应的原因涉及的产品是？82%），主要由于电厂天然气需求量增长，带动公司天然气量同比增长25',\n",
       "   'response': [['天然气']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？4%，澳协认为目前干旱将导致牧草供给下降及奶牛养殖成本攀升，判断澳大利亚下一季度原奶维持减势，同比降幅预计为3-5%，总产量降至83-85亿升',\n",
       "   'response': [['干旱导致供给减少;干旱导致运营成本提升']]}],\n",
       " [{'prompt': '下文中负向影响导致市场价格下降事件对应的原因涉及的产品是？双甘膦价格受下游草甘膦低迷影响较为疲软,公司根据市场情况逐步提高开工负荷,产量不达此前预期,但上半年公司双甘膦毛利率仍达到19.65%,盈利能力良好',\n",
       "   'response': [['草甘膦']]}],\n",
       " [{'prompt': '下文中供给减少导致供给减少事件对应的原因涉及的产品是？由于今年8月份以前育雏鸡补栏量持续减少，进而导致后期2个月的青年鸡存栏量减少###由于今年8月份以前育雏鸡补栏量持续减少，进而导致后期2个月的青年鸡存栏量减少',\n",
       "   'response': [['育雏鸡']]}],\n",
       " [{'prompt': '下文中市场价格提升导致需求减少事件对应的原因涉及的产品是？需求端：我国生猪产能也在不断恢复，饲料需求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###需求端：我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###需求端：7月份，生猪存栏量环比首次正增长，我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###需求端：7月份，生猪存栏量环比首次正增长，我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降',\n",
       "   'response': [['玉米']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的产品是？在国际奶粉价格下降压力下,国内奶价仍有下降空间',\n",
       "   'response': [['奶']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的地区是？在国际奶粉价格下降压力下,国内奶价仍有下降空间',\n",
       "   'response': [['国内']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['需求增加导致市场价格提升']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的原因涉及的产品是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['高碳铬铁']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的原因涉及的行业是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['我们无法得知，可能需要更多内容说明。']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的结果涉及的产品是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "   'response': [['高碳铬铁']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['市场价格下降导致市场价格下降']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的原因涉及的产品是？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['7-ACA']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的产品是？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['原料药']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的结果涉及的行业是？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['暂不清楚，需要更多信息说明。']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？寒潮导致农产品价格继续攀升', 'response': [['寒潮导致市场价格提升']]}],\n",
       " [{'prompt': '下文中寒潮导致市场价格提升事件对应的结果涉及的产品是？寒潮导致农产品价格继续攀升',\n",
       "   'response': [['农产品']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['市场价格下降导致销量（消费）减少']]}],\n",
       " [{'prompt': '下文中市场价格下降导致销量（消费）减少事件对应的原因涉及的产品是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['汽油,柴油']]}],\n",
       " [{'prompt': '下文中市场价格下降导致销量（消费）减少事件对应的结果涉及的产品是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['成品油']]}],\n",
       " [{'prompt': '下文中市场价格下降导致销量（消费）减少事件对应的结果涉及的地区是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "   'response': [['从以上文本中我们暂无发现。']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['市场价格提升导致运营成本提升']]}],\n",
       " [{'prompt': '下文中市场价格提升导致运营成本提升事件对应的原因涉及的产品是？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['煤炭']]}],\n",
       " [{'prompt': '下文中市场价格提升导致运营成本提升事件对应的原因涉及的行业是？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['以上内容暂无说明。']]}],\n",
       " [{'prompt': '下文中市场价格提升导致运营成本提升事件对应的结果涉及的产品是？利润和每股收益大幅下降的主要原因是煤炭价格大幅上涨，导致公司投资的火电企业的煤炭成本大幅增加###但报告期内受煤炭行业去产能改革的影响，公司煤炭前三季度平均采购价格同比上涨44.53%，导致成本涨幅抵消了销售价格的涨幅',\n",
       "   'response': [['煤炭']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['其他自然灾害导致需求减少']]}],\n",
       " [{'prompt': '下文中其他自然灾害导致需求减少事件对应的原因涉及的产品是？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['以上内容暂无说明。']]}],\n",
       " [{'prompt': '下文中其他自然灾害导致需求减少事件对应的原因涉及的地区是？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['国内外']]}],\n",
       " [{'prompt': '下文中其他自然灾害导致需求减少事件对应的结果涉及的产品是？风险因素:国内外因天气等自然灾害导致用肥量锐减',\n",
       "   'response': [['肥']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['供给减少导致市场价格提升']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的原因涉及的产品是？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['棕榈油']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的原因涉及的地区是？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['东南亚']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的结果涉及的产品是？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "   'response': [['原油']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？南美大豆供求宽松对豆类油脂价格利空', 'response': [['供给增加导致负向影响']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的原因涉及的产品是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['大豆']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的原因涉及的地区是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['南美']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的结果涉及的产品是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['豆类油脂']]}],\n",
       " [{'prompt': '下文中需求增加导致市场价格提升事件对应的原因涉及的产品是？尿素：随着天气转暖，春耕、北方小麦返青肥、南方水稻用肥需求增加，价格稳中上涨',\n",
       "   'response': [['小麦返青肥,水稻用肥']]}],\n",
       " [{'prompt': '下文中供给增加导致负向影响事件对应的原因涉及的地区是？南美大豆供求宽松对豆类油脂价格利空',\n",
       "   'response': [['南美']]}],\n",
       " [{'prompt': '下文中需求减少导致市场价格下降事件对应的原因涉及的行业是？随着国内外进入消费淡季，近期国内钢价持续下跌，预计3季度国内钢价将维持弱势，但受成本支撑，国内钢价难以大幅下跌，基本维持高位运行。2009年后，国内钢铁行业盈利将有小幅下降，因为需求减弱导致钢价下跌而成本保持稳定，基于钢价的下跌和新区盈利的改善，马钢下半年盈利吨钢将有小幅下降，我们预计公司2008年将实现净利润41.2亿元，假设年末马钢权证全部行权，预计2008马钢可实现每股收益0.53元，目前股价下对应8倍2008P/E和1.1倍2008P/B，维持增持评级',\n",
       "   'response': [['钢铁行业']]}],\n",
       " [{'prompt': '下文中市场价格下降导致市场价格下降事件对应的原因涉及的产品是？在国际奶粉价格下降压力下,国内奶价仍有下降空间',\n",
       "   'response': [['奶粉']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？但由于7-ACA市场价格大幅下跌，跌幅超过50%，导致公司相关原料药销售价格随之下跌',\n",
       "   'response': [['市场价格下降导致市场价格下降']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？南美大豆供求宽松对豆类油脂价格利空', 'response': [['供给增加导致负向影响']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？寿光受洪水影响，菜价禽价上涨，继续关注通胀机会，重点推荐禽链',\n",
       "   'response': [['洪涝导致市场价格提升']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？10月份，美元贬值成为基本金属价格上涨的主要的动力',\n",
       "   'response': [['市场价格下降导致市场价格提升']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的原因涉及的行业是？浮法玻璃价格上涨、太阳能板块景气度回升是业绩预增的主因2013年年初至今,浮法玻璃行业受益于需求的逐步回升、停产冷修带来的供给端收缩,浮法玻璃格持续上涨,预计年初至今公司的浮法玻璃价格涨幅近10%,因此浮法玻璃业务的盈利大幅提升',\n",
       "   'response': [['浮法玻璃行业']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？风险因素：光棒供给恢复导致光纤价格回落、成本提高压制毛利率',\n",
       "   'response': [['供给增加导致市场价格下降']]}],\n",
       " [{'prompt': '下文中限产导致供给减少事件对应的原因涉及的产品是？而炼焦煤方面，近期虽然处于焦煤消费淡季，但由于钢铁产量仍处于高位，各地停产限产政策都炼焦煤给影响较大，预计炼焦煤供需短期仍处于偏紧状态，焦煤价格仍较坚挺###而炼焦煤方面,近期钢铁产业链需求有所收缩,但由于各地停产限产政策对炼焦煤给影响较大,预计炼焦煤供需短期仍处于偏紧状态,焦煤价格仍较坚挺###而炼焦煤方面，近期钢铁产业链需求有所收缩，但由于各地停产限产政策都炼焦煤给影响较大，预计炼焦煤供需短期仍处于偏紧状态，焦煤价格仍较坚挺###焦煤方面，目前限产政策仍在执行，虽有部分地区煤矿超产，但由于目前焦化厂仍有50-100元/吨的利润，其开工率仍保持在高位，对炼焦煤求尚未明显减少，因此整体炼焦煤资源略显紧张',\n",
       "   'response': [['暂不清楚，需要更多信息说明。']]}],\n",
       " [{'prompt': '下文中供给减少导致市场价格提升事件对应的结果涉及的产品是？除了越南暂停大米出口外，哈萨克斯坦、塞尔维亚、埃及等多地均发布农产品临时出口禁令，供弱需强带动国际大米及小麦近期价格上涨，越南大米到岸价已创2018年8月份以来新高，国际小麦价格较去年同期上涨16%',\n",
       "   'response': [['大米,小麦']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？除了越南暂停大米出口外，哈萨克斯坦、塞尔维亚、埃及等多地均发布农产品临时出口禁令，供弱需强带动国际大米及小麦近期价格上涨，越南大米到岸价已创2018年8月份以来新高，国际小麦价格较去年同期上涨16%',\n",
       "   'response': [['供给减少导致市场价格提升;需求增加导致市场价格提升']]}],\n",
       " [{'prompt': '下文中需求增加导致销量（消费）增加事件对应的原因涉及的产品是？82%），主要由于电厂天然气需求量增长，带动公司天然气量同比增长25',\n",
       "   'response': [['天然气']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？4%，澳协认为目前干旱将导致牧草供给下降及奶牛养殖成本攀升，判断澳大利亚下一季度原奶维持减势，同比降幅预计为3-5%，总产量降至83-85亿升',\n",
       "   'response': [['干旱导致供给减少;干旱导致运营成本提升']]}],\n",
       " [{'prompt': '下文中负向影响导致市场价格下降事件对应的原因涉及的产品是？双甘膦价格受下游草甘膦低迷影响较为疲软,公司根据市场情况逐步提高开工负荷,产量不达此前预期,但上半年公司双甘膦毛利率仍达到19.65%,盈利能力良好',\n",
       "   'response': [['草甘膦']]}],\n",
       " [{'prompt': '下文中供给减少导致供给减少事件对应的原因涉及的产品是？由于今年8月份以前育雏鸡补栏量持续减少，进而导致后期2个月的青年鸡存栏量减少###由于今年8月份以前育雏鸡补栏量持续减少，进而导致后期2个月的青年鸡存栏量减少',\n",
       "   'response': [['育雏鸡']]}],\n",
       " [{'prompt': '下文中市场价格提升导致需求减少事件对应的原因涉及的产品是？需求端：我国生猪产能也在不断恢复，饲料需求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###需求端：我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###需求端：7月份，生猪存栏量环比首次正增长，我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降###需求端：7月份，生猪存栏量环比首次正增长，我国生猪产能也在不断恢复，饲料求量也在不断增加，但随着玉米高价的影响，部分饲料企业已经将部分原料从玉米转变成了小麦，而这将导致玉米需求量下降',\n",
       "   'response': [['玉米']]}],\n",
       " [{'prompt': '下文中有哪些因果事件？除了越南暂停大米出口外，哈萨克斯坦、塞尔维亚、埃及等多地均发布农产品临时出口禁令，供弱需强带动国际大米及小麦近期价格上涨，越南大米到岸价已创2018年8月份以来新高，国际小麦价格较去年同期上涨16%',\n",
       "   'response': [['供给减少导致市场价格提升;需求增加导致市场价格提升']]}]]"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from qianfan.dataset import Dataset\n",
    "\n",
    "# Load the local dataset\n",
    "ds: Dataset = Dataset.load(data_file=\"./data/fin_cqa_train.jsonl\")\n",
    "ds.list()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Upload the local dataset to BOS"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[INFO][2024-06-14 10:11:43.137] baidu_qianfan.py:465 [t:8344509248]: start to create dataset on qianfan\n",
      "[INFO][2024-06-14 10:11:44.218] baidu_qianfan.py:483 [t:8344509248]: create dataset on qianfan successfully\n",
      "[INFO][2024-06-14 10:11:44.220] schema.py:36 [t:8344509248]: unpack dataset before validating\n",
      "[INFO][2024-06-14 10:11:44.223] dataset.py:994 [t:8344509248]: list local dataset data by 0\n",
      "[INFO][2024-06-14 10:11:44.817] utils.py:465 [t:8344509248]: start to write arrow table to .qf_cache/dataset/.mapper_cache/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/data/fin_cqa_train_23581f31-9ada-477e-b9c2-1093517f2642.arrow\n",
      "[INFO][2024-06-14 10:11:44.821] utils.py:481 [t:8344509248]: writing succeeded\n",
      "[INFO][2024-06-14 10:11:44.821] utils.py:349 [t:8344509248]: start to get memory_map from .qf_cache/dataset/.mapper_cache/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/data/fin_cqa_train_23581f31-9ada-477e-b9c2-1093517f2642.arrow\n",
      "[INFO][2024-06-14 10:11:44.822] schema.py:39 [t:8344509248]: pack dataset after validation\n",
      "[INFO][2024-06-14 10:11:44.827] utils.py:465 [t:8344509248]: start to write arrow table to .qf_cache/dataset/.mapper_cache/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/data/fin_cqa_train_0a6c962c-6b6f-429d-85b4-be7a07a13152.arrow\n",
      "[INFO][2024-06-14 10:11:44.831] utils.py:481 [t:8344509248]: writing succeeded\n",
      "[INFO][2024-06-14 10:11:44.831] utils.py:349 [t:8344509248]: start to get memory_map from .qf_cache/dataset/.mapper_cache/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/data/fin_cqa_train_0a6c962c-6b6f-429d-85b4-be7a07a13152.arrow\n",
      "[INFO][2024-06-14 10:11:44.833] dataset.py:994 [t:8344509248]: list local dataset data by slice(0, 9999, None)\n",
      "[INFO][2024-06-14 10:11:44.835] baidu_qianfan.py:251 [t:8344509248]: start to upload data to user BOS\n",
      "[INFO][2024-06-14 10:11:44.836] baidu_qianfan.py:260 [t:8344509248]: upload dataset file .qf_cache/dataset/.qianfan_download_cache/dg-dqymbme3e8vrfc55/ds-wy4hmd811aeh2b3p/1/data_26c80de4-e3ac-4f22-84e6-1d066ba7e543.jsonl to /sdk_ds/data_26c80de4-e3ac-4f22-84e6-1d066ba7e543.jsonl\n",
      "[INFO][2024-06-14 10:11:45.051] baidu_qianfan.py:263 [t:8344509248]: uploading data to user BOS finished\n",
      "[INFO][2024-06-14 10:11:46.129] utils.py:577 [t:8344509248]: successfully create importing task\n",
      "[INFO][2024-06-14 10:11:48.135] utils.py:580 [t:8344509248]: polling import task status\n",
      "[INFO][2024-06-14 10:11:48.619] utils.py:587 [t:8344509248]: import status: 1, keep polling\n",
      "[INFO][2024-06-14 10:11:50.624] utils.py:580 [t:8344509248]: polling import task status\n",
      "[INFO][2024-06-14 10:11:51.283] utils.py:587 [t:8344509248]: import status: 1, keep polling\n",
      "[INFO][2024-06-14 10:11:53.286] utils.py:580 [t:8344509248]: polling import task status\n",
      "[INFO][2024-06-14 10:11:53.652] utils.py:587 [t:8344509248]: import status: 1, keep polling\n",
      "[INFO][2024-06-14 10:11:55.658] utils.py:580 [t:8344509248]: polling import task status\n",
      "[INFO][2024-06-14 10:11:56.136] utils.py:590 [t:8344509248]: import succeed\n"
     ]
    }
   ],
   "source": [
    "# Save to the Qianfan platform\n",
    "from qianfan.dataset.data_source import QianfanDataSource\n",
    "from qianfan.resources.console import consts as console_consts\n",
    "\n",
    "bos_bucket_name = \"your_bucket_name\"\n",
    "bos_bucket_file_path = \"/sdk_ds/\"\n",
    "qianfan_dataset_name = \"random_sdk_trainer_ds\"\n",
    "\n",
    "# Create a Qianfan dataset, then upload and save the data to it\n",
    "qianfan_data_source = QianfanDataSource.create_bare_dataset(\n",
    "    name=qianfan_dataset_name,\n",
    "    template_type=console_consts.DataTemplateType.NonSortedConversation,\n",
    "    storage_type=console_consts.DataStorageType.PrivateBos,\n",
    "    storage_id=bos_bucket_name,\n",
    "    storage_path=bos_bucket_file_path,\n",
    ")\n",
    "\n",
    "ds = ds.save(qianfan_data_source)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
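    "### Training with LLMFinetune\n",
    "`LLMFinetune` is a trainer that implements supervised fine-tuning (SFT). Internally it assembles the basic `Pipeline` that SFT needs, chaining the steps data -> training -> model publishing -> service invocation."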
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "from qianfan.trainer.consts import PeftType\n",
    "\n",
    "trainer = Finetune(\n",
    "    train_type=\"ERNIE-Speed-8K\",\n",
    "    train_config=TrainConfig(\n",
    "        epoch=1,\n",
    "        learning_rate=0.0003,\n",
    "        max_seq_len=4096,\n",
    "        peft_type=PeftType.LoRA,\n",
    "        logging_steps=1,\n",
    "        warmup_ratio=0.10,\n",
    "        weight_decay=0.0100,\n",
    "        lora_rank=8,\n",
    "        lora_all_linear=\"True\",\n",
    "    ),\n",
    "    dataset=DatasetConfig(\n",
    "        datasets=[ds],\n",
    "        eval_split_ratio=10,    # share of the data split off for evaluation: 10%\n",
    "        corpus_proportion=0.03, # mix in Qianfan general-purpose training corpus: 0.03%\n",
    "    ),\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Running the task\n",
    "Run the trainer synchronously; training continues until the model has been published."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[INFO][2024-06-14 10:12:56.543] utils.py:776 [t:6280982528]: data releasing, keep polling\n",
      "[INFO][2024-06-14 10:12:59.069] utils.py:776 [t:6280982528]: data releasing, keep polling\n",
      "[INFO][2024-06-14 10:13:01.529] utils.py:776 [t:6280982528]: data releasing, keep polling\n",
      "[INFO][2024-06-14 10:13:03.986] utils.py:776 [t:6280982528]: data releasing, keep polling\n",
      "[INFO][2024-06-14 10:13:06.441] utils.py:776 [t:6280982528]: data releasing, keep polling\n",
      "[INFO][2024-06-14 10:13:08.896] utils.py:783 [t:6280982528]: data releasing succeeded\n",
      "[INFO][2024-06-14 10:13:11.868] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 1% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:13:42.494] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 1% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:14:12.930] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 3% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:14:43.429] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 3% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:15:13.924] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:15:44.363] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:16:14.891] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:16:45.425] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:17:15.884] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:17:46.378] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:18:16.917] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:18:47.434] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:19:17.877] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:19:48.372] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 34% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:20:18.981] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:20:18.984] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:20:49.595] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:20:49.597] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:21:20.176] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:21:20.177] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:21:50.766] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:21:50.768] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:22:21.333] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:22:21.334] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:22:51.950] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:22:51.951] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:23:22.466] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:23:22.469] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:23:53.080] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:23:53.081] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:24:23.597] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:24:23.598] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:24:54.047] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:24:54.050] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:25:24.659] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:25:24.678] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:25:55.293] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:25:55.295] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:26:25.792] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:26:25.795] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:26:56.272] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:26:56.273] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:27:26.752] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:27:26.755] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:27:57.403] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:27:57.404] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:28:27.917] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:28:27.918] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:28:58.434] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:28:58.437] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:29:28.893] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:29:28.895] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:29:59.395] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:29:59.396] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:30:29.890] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:30:29.893] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:31:00.377] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:31:00.379] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:31:30.883] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:31:30.885] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:32:01.439] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:32:01.443] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:32:31.920] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:32:31.922] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:33:02.482] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:33:02.488] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:33:33.079] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:33:33.080] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:34:03.622] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:34:03.624] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:34:34.188] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:34:34.191] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:35:04.659] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:35:04.660] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:35:35.138] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:35:35.141] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:36:05.593] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:36:05.595] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:36:36.044] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:36:36.046] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:37:06.606] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:37:06.609] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:37:37.065] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:37:37.067] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:38:07.553] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:38:07.554] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:38:38.018] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:38:38.020] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:39:08.482] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:39:08.483] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:39:38.947] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:39:38.948] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:40:09.452] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:40:09.455] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:40:39.948] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:40:39.949] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:41:10.387] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:41:10.390] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:41:40.964] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:41:40.969] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:42:11.581] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:42:11.582] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:42:42.137] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:42:42.140] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:43:12.576] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:43:12.578] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:43:42.999] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:43:43.000] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:44:13.558] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:44:13.560] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:44:44.039] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:44:44.042] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:45:14.540] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:45:14.544] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:45:45.083] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 65% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:45:45.084] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:46:15.749] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 96% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:46:15.752] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:46:46.248] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 99% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:46:46.250] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:47:16.738] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 99% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:47:16.739] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:47:47.160] actions.py:663 [t:6280982528]: [train_action] training ... job_name:model0f228692_BxBbC current status: Running, 99% check train task log in https://console.bce.baidu.com/qianfan/train/sft/job-5z1x5c2ecmtx/task-ab2597tkcfud/detail/traininglog\n",
      "[INFO][2024-06-14 10:47:47.167] actions.py:670 [t:6280982528]:  check vdl report in https://console.bce.baidu.com/qianfan/visualdl/index?displayToken=eyJydW5JZCI6InJ1bi1ka2tpMDR2aGd2NHcxM2RyIn0=\n",
      "[INFO][2024-06-14 10:48:17.418] actions.py:638 [t:6280982528]: [train_action] training task metrics: {'BLEU-4': '1.98%', 'ROUGE-1': '7.41%', 'ROUGE-2': '0.37%', 'ROUGE-L': '5.80%'}\n",
      "[INFO][2024-06-14 10:48:17.419] actions.py:639 [t:6280982528]: [train_action] training task checkpoints: []\n",
      "[INFO][2024-06-14 10:48:17.420] actions.py:677 [t:6280982528]: [train_action] training job has ended: job-5z1x5c2ecmtx/task-ab2597tkcfud with status: Done\n",
      "[WARNING][2024-06-14 10:48:17.424] model.py:95 [t:6280982528]: model id or version_id should be provided\n",
      "[INFO][2024-06-14 10:48:17.426] model.py:222 [t:6280982528]: check train job: task-ab2597tkcfud/job-5z1x5c2ecmtx status before publishing model\n",
      "[INFO][2024-06-14 10:48:17.663] model.py:235 [t:6280982528]: model publishing keep polling, current status Done\n",
      "[INFO][2024-06-14 10:48:19.154] model.py:273 [t:6280982528]: publishing train task: job-5z1x5c2ecmtx/task-ab2597tkcfud to model: am-eush4fhk3ccb/amv-8sgyr4aqptvs\n",
      "[INFO][2024-06-14 10:48:49.961] model.py:298 [t:6280982528]: model am-eush4fhk3ccb/amv-8sgyr4aqptvs published successfully\n",
      "[INFO][2024-06-14 10:48:49.965] model.py:278 [t:6280982528]: publish successfully to model: am-eush4fhk3ccb/amv-8sgyr4aqptvs\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<qianfan.trainer.finetune.Finetune at 0x1482c41d0>"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trainer.run()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Get the output of the fine-tune task:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'datasets': {'sourceType': 'Platform',\n",
       "  'versions': [{'versionId': 'ds-wy4hmd811aeh2b3p'}],\n",
       "  'splitRatio': 10.0,\n",
       "  'corpusProportion': '0.03%'},\n",
       " 'task_id': 'task-ab2597tkcfud',\n",
       " 'job_id': 'job-5z1x5c2ecmtx',\n",
       " 'metrics': {'BLEU-4': '1.98%',\n",
       "  'ROUGE-1': '7.41%',\n",
       "  'ROUGE-2': '0.37%',\n",
       "  'ROUGE-L': '5.80%'},\n",
       " 'checkpoints': [],\n",
       " 'model_id': 'am-eush4fhk3ccb',\n",
       " 'model_version_id': 'amv-8sgyr4aqptvs',\n",
       " 'model': <qianfan.model.model.Model at 0x12d8f1bd0>}"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "trainer.output"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
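    "### Run batch evaluation inference\n",
    "The Model object can run a model over an evaluation dataset in batch and save the results to the Qianfan platform."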
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Call `batch_run_on_qianfan` on the Model object produced by the fine-tune run to start a batch task. This can take tens of minutes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[INFO][2024-06-14 11:04:03.995] dataset.py:408 [t:8344509248]: no data source was provided, construct\n",
      "[INFO][2024-06-14 11:04:03.997] dataset.py:282 [t:8344509248]: construct a qianfan data source from existed id: ds-wy4hmd811aeh2b3p, with args: {}\n",
      "[INFO][2024-06-14 11:04:04.879] dataset_utils.py:410 [t:8344509248]: start to create evaluation task in model\n",
      "[INFO][2024-06-14 11:04:05.788] dataset_utils.py:372 [t:8344509248]: start to polling status of evaluation task ame-txnmnuqczdzx\n",
      "[INFO][2024-06-14 11:04:06.068] dataset_utils.py:379 [t:8344509248]: current eval_state: Pending\n",
      "[INFO][2024-06-14 11:04:36.344] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:05:06.676] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:05:36.964] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:06:07.290] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:06:37.656] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:07:07.938] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:07:38.215] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:08:08.480] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:08:38.744] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:09:09.090] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:09:39.367] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:10:09.661] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:10:39.958] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:11:10.269] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:11:40.797] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:12:11.062] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:12:41.317] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:13:11.641] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:13:41.943] dataset_utils.py:379 [t:8344509248]: current eval_state: Doing\n",
      "[INFO][2024-06-14 11:14:12.329] dataset_utils.py:379 [t:8344509248]: current eval_state: DoingWithManualBegin\n",
      "[INFO][2024-06-14 11:14:12.332] dataset_utils.py:397 [t:8344509248]: get result dataset id ds-nrndbniqz4swucd2\n",
      "[INFO][2024-06-14 11:14:12.334] dataset.py:408 [t:8344509248]: no data source was provided, construct\n",
      "[INFO][2024-06-14 11:14:12.336] dataset.py:282 [t:8344509248]: construct a qianfan data source from existed id: ds-nrndbniqz4swucd2, with args: {'is_download_to_local': True}\n",
      "[WARNING][2024-06-14 11:14:12.805] baidu_qianfan.py:750 [t:8344509248]: parameter \"is_download_to_local\" has been set as deprecated\n",
      "[INFO][2024-06-14 11:14:13.292] baidu_qianfan.py:359 [t:8344509248]: no cache was found, download cache\n",
      "[INFO][2024-06-14 11:14:13.758] baidu_qianfan.py:285 [t:8344509248]: get dataset info succeeded for dataset id ds-nrndbniqz4swucd2\n",
      "[INFO][2024-06-14 11:14:13.760] utils.py:711 [t:8344509248]: start to export dataset\n",
      "[INFO][2024-06-14 11:14:14.553] utils.py:715 [t:8344509248]: create dataset export task successfully\n",
      "[INFO][2024-06-14 11:14:16.559] utils.py:720 [t:8344509248]: polling export task status\n",
      "[INFO][2024-06-14 11:14:17.106] utils.py:728 [t:8344509248]: export status: 1, keep polling\n",
      "[INFO][2024-06-14 11:14:19.110] utils.py:720 [t:8344509248]: polling export task status\n",
      "[INFO][2024-06-14 11:14:19.511] utils.py:728 [t:8344509248]: export status: 1, keep polling\n",
      "[INFO][2024-06-14 11:14:21.514] utils.py:720 [t:8344509248]: polling export task status\n",
      "[INFO][2024-06-14 11:14:21.895] utils.py:728 [t:8344509248]: export status: 1, keep polling\n",
      "[INFO][2024-06-14 11:14:23.897] utils.py:720 [t:8344509248]: polling export task status\n",
      "[INFO][2024-06-14 11:14:24.294] utils.py:728 [t:8344509248]: export status: 1, keep polling\n",
      "[INFO][2024-06-14 11:14:26.297] utils.py:720 [t:8344509248]: polling export task status\n",
      "[INFO][2024-06-14 11:14:26.722] utils.py:728 [t:8344509248]: export status: 1, keep polling\n",
      "[INFO][2024-06-14 11:14:28.729] utils.py:720 [t:8344509248]: polling export task status\n",
      "[INFO][2024-06-14 11:14:29.170] utils.py:725 [t:8344509248]: export succeed\n",
      "[INFO][2024-06-14 11:14:29.478] utils.py:659 [t:8344509248]: get export records succeeded for dataset id ds-nrndbniqz4swucd2\n",
      "[INFO][2024-06-14 11:14:29.480] utils.py:673 [t:8344509248]: latest dataset with time2024-06-14 11:14:27 for dataset ds-nrndbniqz4swucd2\n",
      "[INFO][2024-06-14 11:14:29.483] utils.py:739 [t:8344509248]: start to download file from url https://bj.bcebos.com/easydata-upload/_easydata-download_/6c6093c96f0241c087af184cc5729de8/%E8%AF%84%E4%BC%B0%E4%BB%BB%E5%8A%A1_model_run_I6nfm3LiMO_%E7%BB%93%E6%9E%9C%E9%9B%86_52d091V1_20240614_111414.zip?authorization=bce-auth-v1%2F50c8bb753dcb4e1d8646bb1ffefd3503%2F2024-06-14T03%3A14%3A29Z%2F3600%2Fhost%2F579752f49fd62dc4286d9bef1e52995e343780d6c72f983c2f09c606aea912b3\n",
      "[INFO][2024-06-14 11:14:29.753] baidu_qianfan.py:300 [t:8344509248]: download dataset zip to .qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/bin.zip succeeded\n",
      "[INFO][2024-06-14 11:14:29.769] baidu_qianfan.py:325 [t:8344509248]: unzip dataset to path .qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/content successfully\n",
      "[INFO][2024-06-14 11:14:29.770] baidu_qianfan.py:329 [t:8344509248]: write dataset info to path .qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/info.json successfully\n",
      "[INFO][2024-06-14 11:14:30.195] utils.py:418 [t:8344509248]: need create cached arrow file for /Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/.qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/content/dataset.jsonl\n",
      "[INFO][2024-06-14 11:14:30.197] utils.py:465 [t:8344509248]: start to write arrow table to .qf_cache/dataset/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/.qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/content/dataset.arrow\n",
      "[INFO][2024-06-14 11:14:30.215] utils.py:481 [t:8344509248]: writing succeeded\n",
      "[INFO][2024-06-14 11:14:30.218] utils.py:349 [t:8344509248]: start to get memory_map from .qf_cache/dataset/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/.qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/content/dataset.arrow\n",
      "[INFO][2024-06-14 11:14:30.220] utils.py:277 [t:8344509248]: has got a memory-mapped table\n",
      "[INFO][2024-06-14 11:14:30.226] utils.py:465 [t:8344509248]: start to write arrow table to .qf_cache/dataset/.mapper_cache/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/.qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/content_262c088c-0aba-4c51-b09c-de04530c4a9c.arrow\n",
      "[INFO][2024-06-14 11:14:30.229] utils.py:481 [t:8344509248]: writing succeeded\n",
      "[INFO][2024-06-14 11:14:30.230] utils.py:349 [t:8344509248]: start to get memory_map from .qf_cache/dataset/.mapper_cache/Users/zhonghanjun/pywp/bce-qianfan-sdk/cookbook/finetune/.qf_cache/dataset/.qianfan_download_cache/dg-hnuxpbnnw6wu4hz2/ds-nrndbniqz4swucd2/1/content_262c088c-0aba-4c51-b09c-de04530c4a9c.arrow\n"
     ]
    }
   ],
   "source": [
    "from qianfan.model import Model\n",
    "from qianfan.dataset import Dataset\n",
    "\n",
    "# 首先需要先加载测试数据集，这里以加载刚上传的训练集为例子：\n",
    "test_ds = Dataset.load(qianfan_dataset_id=qianfan_data_source.id)\n",
    "\n",
    "# 从训练结果中获取模型对象\n",
    "m: Model = trainer.output[\"model\"]\n",
    "\n",
    "# 运行批量任务获取结果数据集\n",
    "result_ds: Dataset = m.batch_inference(test_ds)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "通过这种方式运行完成后，可以直接在本地拿到一份批量运行的结果，我们可以通过dataset.list查看其中的部分数据："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[INFO][2024-06-14 11:16:28.242] dataset.py:994 [t:8344509248]: list local dataset data by [3, 4, 5]\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "[{'prompt': '下文中需求增加导致市场价格提升事件对应的结果涉及的产品是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "  'input_prompt': '下文中需求增加导致市场价格提升事件对应的结果涉及的产品是？随着供应缩减，需求回升，高碳铬铁价格企稳回升，预期短期内或仍维持偏强态势，但高碳铬铁产能依然偏向过剩，后期价格弹升空间有限',\n",
       "  'llm_output': '根据上文描述，涉及的产品是高碳铬铁。',\n",
       "  'expected_output': '高碳铬铁'},\n",
       " {'prompt': '下文中有哪些因果事件？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "  'input_prompt': '下文中有哪些因果事件？东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产，原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨###东南亚棕榈油减产,原油价格大幅上涨',\n",
       "  'llm_output': '上述文本描述了东南亚棕榈油减产和原油价格大幅上涨这两个事件，并且它们之间存在因果关系。文本中多次提到这两个事件，表达的意思是东南亚棕榈油减产导致了原油价格的上涨。因此，文本中的因果事件是：东南亚棕榈油减产导致了原油价格的大幅上涨。',\n",
       "  'expected_output': '供给减少导致市场价格提升'},\n",
       " {'prompt': '下文中市场价格下降导致销量（消费）减少事件对应的原因涉及的产品是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "  'input_prompt': '下文中市场价格下降导致销量（消费）减少事件对应的原因涉及的产品是？而发改委分别在5月10日和6月9日对国内汽柴油价格进行了下调，汽油和柴油零售价格分别累计下调了860和820元/吨，导致石化双雄（尤其是中石化）不仅炼油损逐月扩大、成品油销售受阻并且出现库存跌价',\n",
       "  'llm_output': '根据提供的信息，发改委对国内汽柴油价格进行了下调，导致石化双雄（尤其是中石化）的成品油销售受阻并且出现库存跌价。因此，涉及的产品是汽柴油。汽柴油价格的下降导致了销量减少的事件，原因是消费者可能会因为价格下降而推迟购买或者减少购买量，等待价格进一步下降。',\n",
       "  'expected_output': '汽油,柴油'}]"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result_ds.list([i for i in range(3,6)])"
   ]
  },
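  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Each record above pairs an `llm_output` with its `expected_output`, so a rough automatic check can complement reading the rows by eye. The cell below is a minimal sketch and not part of the qianfan SDK: `containment_rate` and the `sample` records are hypothetical names and data introduced for illustration. It assumes `expected_output` is a comma-separated list of items, as in the rows shown above, and counts a record as a hit only when every item appears verbatim in `llm_output`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import Dict, List\n",
    "\n",
    "\n",
    "def containment_rate(records: List[Dict[str, str]]) -> float:\n",
    "    # Share of records whose llm_output contains every comma-separated item\n",
    "    # of expected_output. A rough heuristic, not an official metric.\n",
    "    if not records:\n",
    "        return 0.0\n",
    "    hits = 0\n",
    "    for rec in records:\n",
    "        items = [s.strip() for s in rec['expected_output'].split(',') if s.strip()]\n",
    "        if all(item in rec['llm_output'] for item in items):\n",
    "            hits += 1\n",
    "    return hits / len(records)\n",
    "\n",
    "\n",
    "# Hypothetical records that only mimic the field names of the result dataset.\n",
    "sample = [\n",
    "    {'llm_output': 'The product involved is high-carbon ferrochrome.',\n",
    "     'expected_output': 'high-carbon ferrochrome'},\n",
    "    {'llm_output': 'The product involved is diesel fuel.',\n",
    "     'expected_output': 'gasoline,diesel'},\n",
    "]\n",
    "\n",
    "# With real results, a list of records such as the one displayed above\n",
    "# could be passed in place of `sample`.\n",
    "containment_rate(sample)"
   ]
  },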
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "在完成模型的批量运行后，我们可以对模型有一个简单的体感评估，如果效果不错，我们可以选择发布成服务以最终应用生产："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[INFO][2024-06-14 11:16:56.899] model.py:518 [t:8344509248]: ready to deploy service with model am-eush4fhk3ccb/amv-8sgyr4aqptvs\n",
      "[INFO][2024-06-14 11:22:07.557] model.py:575 [t:8344509248]: service svco-nxavbjiyqanc has been deployed in `o0luzbiw_sdkcorpus` \n"
     ]
    }
   ],
   "source": [
    "#-# cell_skip\n",
    "from qianfan.model import Service\n",
    "from qianfan.model.consts import ServiceType\n",
    "from qianfan.resources.console.consts import DeployPoolType\n",
    "\n",
    "sft_svc: Service = m.deploy(DeployConfig(\n",
    "    name=\"spcorpus\",\n",
    "    endpoint_suffix=\"sdkcorpus\",\n",
    "    replicas=1, # 副本数， 与qps强绑定\n",
    "    pool_type=DeployPoolType.PrivateResource, # 私有资源池\n",
    "    service_type=ServiceType.Chat,\n",
    "    hours=1,\n",
    "))\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "使用Finetune之后的模型服务和原始的预置模型服务调用："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'下文中的因果事件如下：\\n\\n1. 无取向硅钢广泛应用于铁芯等电机零部件。\\n2. 无取向硅钢产量的持续提升导致市场竞争愈发激烈。\\n3. 市场竞争激烈导致无取向硅钢价格进一步降低。\\n4. 无取向硅钢价格降低有效减少新能源汽车驱动电机行业内企业的成本支出。'"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#-# cell_skip\n",
    "from qianfan import ChatCompletion\n",
    "### 使用Model & Service调用模型\n",
    "\n",
    "problem=\"下文中有哪些因果事件？无取向硅钢广泛应用于铁芯等电机零部件其产量的持续提升导致市场竞争愈发激烈，价格进一步降低，从而有效减少新能源汽车驱动电机行业内企业的成本支出\"\n",
    "\n",
    "#获取服务对象，即ChatCompletion等类型的对象\n",
    "chat_comp: ChatCompletion = sft_svc.get_res()\n",
    "sft_chat_resp = chat_comp.do(messages=[{\"content\": problem, \"role\": \"user\"}])\n",
    "sft_chat_resp[\"result\"]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.5"
  },
  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "58f7cb64c3a06383b7f18d2a11305edccbad427293a2b4afa7abe8bfc810d4bb"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
